Jointly Optimal Routing and Caching for Arbitrary Network Topologies
We study a problem of fundamental importance to ICNs, namely, minimizing
routing costs by jointly optimizing caching and routing decisions over an
arbitrary network topology. We consider both source routing and hop-by-hop
routing settings. The respective offline problems are NP-hard. Nevertheless, we
show that there exist polynomial-time approximation algorithms producing
solutions within a constant factor of the optimal. We also produce
distributed, adaptive algorithms with the same approximation guarantees. We
simulate our adaptive algorithms over a broad array of different topologies.
Our algorithms reduce routing costs by several orders of magnitude compared to
prior art, including algorithms optimizing caching under fixed routing.
Comment: This is the extended version of the paper "Jointly Optimal Routing
and Caching for Arbitrary Network Topologies", appearing in the 4th ACM
Conference on Information-Centric Networking (ICN 2017), Berlin, Sep. 26-28,
201
Truthful Linear Regression
We consider the problem of fitting a linear model to data held by individuals
who are concerned about their privacy. Incentivizing most players to truthfully
report their data to the analyst constrains our design to mechanisms that
provide a privacy guarantee to the participants; we use differential privacy to
model individuals' privacy losses. This immediately poses a problem, as
differentially private computation of a linear model necessarily produces a
biased estimation, and existing approaches to design mechanisms to elicit data
from privacy-sensitive individuals do not generalize well to biased estimators.
We overcome this challenge through an appropriate design of the computation and
payment scheme.
Comment: To appear in Proceedings of the 28th Annual Conference on Learning
Theory (COLT 2015
Learning Mixtures of Linear Classifiers
We consider a discriminative learning (regression) problem, whereby the
regression function is a convex combination of k linear classifiers. Existing
approaches are based on the EM algorithm, or similar techniques, without
provable guarantees. We develop a simple method based on spectral techniques
and a "mirroring" trick, that discovers the subspace spanned by the
classifiers' parameter vectors. Under a probabilistic assumption on the feature
vector distribution, we prove that this approach has nearly optimal statistical
efficiency.
Recommending with an Agenda: Active Learning of Private Attributes using Matrix Factorization
Recommender systems leverage user demographic information, such as age,
gender, etc., to personalize recommendations and better place their targeted
ads. Oftentimes, users do not volunteer this information due to privacy
concerns, or due to a lack of initiative in filling out their online profiles.
We illustrate a new threat in which a recommender learns private attributes of
users who do not voluntarily disclose them. We design both passive and active
attacks that solicit ratings for strategically selected items, and could thus
be used by a recommender system to pursue this hidden agenda. Our methods are
based on a novel usage of Bayesian matrix factorization in an active learning
setting. Evaluations on multiple datasets illustrate that such attacks are
indeed feasible and use significantly fewer rated items than static inference
methods. Importantly, they succeed without sacrificing the quality of
recommendations to users.
Comment: This is the extended version of a paper that appeared in ACM RecSys
201
A Family of Tractable Graph Distances
Important data mining problems such as nearest-neighbor search and clustering
admit theoretical guarantees when restricted to objects embedded in a metric
space. Graphs are ubiquitous, and clustering and classification over graphs
arise in diverse areas, including image processing and social networks.
Unfortunately, popular distance scores used in these applications that scale
to large graphs are not metrics and thus come with no guarantees. Classic
graph distances such as the chemical and the CKS distance are arguably
natural and intuitive, and are indeed also metrics, but they are intractable:
as such, their computation does not scale to large graphs. We define a broad
family of graph distances, that includes both the chemical and the CKS
distance, and prove that these are all metrics. Crucially, we show that our
family includes metrics that are tractable. Moreover, we extend these distances
by incorporating auxiliary node attributes, which is important in practice,
while maintaining both the metric property and tractability.
Comment: Extended version of paper appearing in SDM 201